Background of the Invention 

From an organic chemistry standpoint, the process of drug design can be 
considered to involve two steps. First, a lead chemical template (often one or 
more) is selected. Second, a synthetic chemistry effort is undertaken to create 
analogs of the lead chemical template to create a compound or compounds 
possessing the desired therapeutic and pharmacokinetic properties. 

An important step in the drug discovery process is the selection of a 
suitable lead chemical template upon which to base a chemistry analog program. 
The process of identifying a lead chemical template for a given molecular target 
typically involves screening a large number of compounds (often more than 
100,000) in a functional assay, selecting a subset based on some arbitrary 
activity threshold for testing in a secondary assay to confirm activity, and then 
assessing the remaining active compounds for suitability of chemical 
elaboration. 

This process can be quite time- and resource-consuming, and has 
numerous disadvantages. It requires the development and implementation of a 
high-throughput functional assay, which by definition requires that the function 
of the molecular target be known. It requires the testing of large numbers of 
compounds, the vast majority of which will be inactive for a given molecular 
target. It leads to the depletion of chemical resources and requires the continual 
maintenance of large collections of compounds. Importantly, it often leads to a 
final pool of potential lead templates that for the most part, with the exception of 
affinity for a given molecular target, do not possess desirable drug-like qualities. 
In some cases, high-throughput functional assays do not identify any compounds 
from the large number (e.g., 100,000) of compounds screened that meet the 
criteria established for activity. 

Thus, what is needed is a faster and better approach to identifying a lead 
chemical template. 

Summary of the Invention 

The present invention is related to rational drug design. Specifically, the 
present invention provides an approach to the development of a library of 
compounds as well as methods for identifying compounds (e.g., ligands) that 
bind to a specific target molecule (e.g., proteins) and lead chemical templates 
that can be used, for example, in drug discovery and design. Significantly and 
preferably, this approach for identifying ligands for target molecules (e.g., 
proteins) uses nuclear magnetic resonance (NMR) spectroscopy. There are 
numerous NMR spectroscopic techniques currently available that detect binding 
of small molecules to targets such as protein targets, including targets identified 
using genomics techniques that lack a functional assay. Ligands with only 
moderate binding affinities, which might be overlooked in a traditional 
functional assay but yet might serve as templates for subsequent synthetic 
chemistry efforts, can potentially be identified using the present invention. 
Preferably, one method of the present invention involves the use of flow NMR 
techniques, which can reduce the amount of time and effort required to evaluate 
small molecules for binding to a given target. 

In one aspect, the present invention provides a method of creating a 
chemical compound library, and the library itself. The method includes: 
selecting compounds having a molecular weight of no greater than about 350 
grams/mole; and selecting compounds having a solubility in deuterated water of 
at least about 1 mM at room temperature. Preferably, a majority (i.e., greater 
than 50%) of the compounds in the chemical compound library have a molecular 
weight of no greater than about 350 grams/mole and a solubility in deuterated 
water of at least about 1 mM at room temperature. More preferably, at least 
about 75% of the compounds, and most preferably, all of the compounds in the 
chemical compound library have a molecular weight of no greater than about 
350 grams/mole and a solubility in deuterated water of at least about 1 mM at 
room temperature. Preferably, this library of compounds includes at least about 
250 compounds, more preferably, at least about 300 compounds, and most 
preferably, at least about 2000 compounds, and the compounds have relatively 
diverse chemical structures. Herein, the molecular weights of the compounds are 
determined without solubilizing counterions (if the compounds are salts) and 
without water molecules of hydration. Also, concentrations are reported based 
on aqueous solutions, which may or may not include a buffer. 
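The two selection criteria above can be sketched in code. The following is a minimal illustration only, assuming each candidate already has a measured molecular weight (excluding counterions and waters of hydration) and a measured solubility in deuterated water; the `Compound` record and function names are hypothetical and are not part of the described method.

```python
from dataclasses import dataclass

# Thresholds taken from the text ("about 350 grams/mole", "about 1 mM").
MAX_MOL_WEIGHT = 350.0      # g/mol
MIN_SOLUBILITY_MM = 1.0     # mM in deuterated water at room temperature


@dataclass(frozen=True)
class Compound:
    """Hypothetical record for one candidate compound."""
    name: str
    mol_weight: float       # g/mol, without counterions or waters of hydration
    solubility_mm: float    # measured solubility in deuterated water, mM


def meets_criteria(compound):
    """True if the compound satisfies both library criteria."""
    return (compound.mol_weight <= MAX_MOL_WEIGHT
            and compound.solubility_mm >= MIN_SOLUBILITY_MM)


def select_library(candidates):
    """Keep only candidates that satisfy both criteria."""
    return [c for c in candidates if meets_criteria(c)]


def fraction_meeting(compounds):
    """Fraction of a collection that satisfies both criteria (0.0 if empty)."""
    if not compounds:
        return 0.0
    return sum(meets_criteria(c) for c in compounds) / len(compounds)
```

The `fraction_meeting` helper corresponds to the "majority", "at least about 75%", and "all" preferences stated above: a collection can be checked against 0.5, 0.75, or 1.0 respectively.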

In another embodiment, the present invention provides a method of 
identifying a lead chemical template (of which there often may be one or more), 
for example, for designing a bioactive agent such as a drug (e.g., a compound 
having therapeutic and/or prophylactic capabilities). The method includes: 
selecting compounds having a molecular weight of no greater than about 350 
grams/mole, and a solubility in deuterated water of at least about 1 mM at room 
temperature to create a chemical compound library; identifying at least one 
compound from the library that functions as a ligand (i.e., a compound that binds 
to a target molecule) having a dissociation constant to a target molecule (e.g., 
protein) of no weaker than (i.e., at least) about 100 µM; and using the ligand to 
identify a lead chemical template, which can be used, for example, for designing 
a drug. Preferably, the lead chemical template has a dissociation constant to a 
target molecule (e.g., protein) of no weaker than (i.e., at least) about 1 µM. 
Preferably, the lead chemical template can be identified through further 
screening efforts or through direct chemical elaborations. Preferably, a majority 
(i.e., greater than 50%) of the compounds in the chemical compound library, 
more preferably, at least about 75%, and most preferably, all of the compounds 
in the chemical compound library, have a molecular weight of no greater than 
about 350 grams/mole and a solubility in deuterated water of at least about 1 
mM at room temperature. 

Another embodiment of the present invention provides a method of 
identifying a compound that binds to a target molecule (e.g., protein). The 
method includes: providing a plurality of mixtures of test compounds, each 
mixture being in a (separate) sample reservoir (preferably, a sample reservoir of 
a multiwell sample holder (e.g., a 96-well microtiter plate)); introducing a target 
molecule (e.g., protein) into each of the sample reservoirs to provide a plurality 
of test samples; providing a nuclear magnetic resonance spectrometer equipped 
with a flow-injection probe; transferring each test sample from the sample 
reservoir into the flow-injection probe; collecting a relaxation-edited 
(preferably, a one-dimensional (1D) relaxation-edited) nuclear magnetic 
resonance spectrum (preferably, a ¹H NMR spectrum) on each sample in each 
reservoir; and comparing the spectra of each sample to the spectra taken under 
the same conditions in the absence of the target molecule (e.g., protein) to 
identify compounds that bind to the target molecule (e.g., protein); wherein the 
concentration of target molecule (e.g., protein) and each compound in each 
sample is no greater than about 100 µM. Preferably, the mixture of compounds 
comprises at least about 3 compounds (more preferably, at least about 6 
compounds, and most preferably, at least about 10 compounds), each having at 
least one distinguishable resonance in a one-dimensional (1D) NMR spectrum 
(preferably, a 1D ¹H NMR spectrum) of the mixture. 

Preferably, in this method, the ratio of target molecule (e.g., protein) to 
compounds in each sample reservoir is about 1:1. More preferably, the 
concentration of target molecule (e.g., protein) and each compound in each 
sample is at least about 25 µM. Most preferably, the concentration of target 
molecule (e.g., protein) and each compound in each sample is no greater than 
about 50 µM. 

Sample requirements can be reduced even further if WaterLOGSY 
(water-ligand observation with gradient spectroscopy) methods are used as an 
alternative to the relaxation-editing method described above to detect the binding 
interaction. 

The present invention provides yet another method of identifying a 
compound that binds to a target molecule (e.g., protein). This method includes: 
providing a plurality of mixtures of test compounds, each mixture being in a 
sample reservoir; introducing a target molecule into each of the sample 
reservoirs to provide a plurality of test samples; providing a nuclear magnetic 
resonance spectrometer equipped with a flow-injection probe; transferring each 
test sample from the sample reservoir into the flow-injection probe; collecting a 
WaterLOGSY nuclear magnetic resonance spectrum (preferably, a 1D 
WaterLOGSY nuclear magnetic resonance spectrum) on each sample in each 
reservoir; and analyzing the spectra of each sample to distinguish binding 
compounds from nonbinding compounds by virtue of the opposite sign of their 
water-ligand nuclear Overhauser effects (NOEs). Preferably, the concentration 
of each compound in each sample is no greater than about 100 µM, although 
higher concentrations can be used if desired. 

In this method, when binding is detected using the WaterLOGSY 
technique, extremely low levels of target can be used with ratios of ligand to 
target of about 100:1 to about 10:1. Preferably, the concentration of target 
molecule is no greater than about 10 µM, and can be no greater than about 1 µM. 
More preferably, the concentration of target molecule is about 1 µM to about 10 
µM. For data analysis, binding compounds are distinguished from nonbinders 
(i.e., nonbinding compounds) by the opposite sign of their water-ligand NOEs. 
There is no need to collect a reference spectrum in the absence of a target 
molecule. 
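The sign-based analysis described above can be sketched as a simple classification step. This is an illustrative sketch only: it assumes that one resolved resonance per compound has already been integrated to a signed intensity, and that the sign associated with binders (`binder_sign`) is known from how the spectrum was phased. The function name, the `binder_sign` parameter, and the `noise_floor` cutoff are assumptions made for illustration and are not specified in the text.

```python
def classify_waterlogsy(peak_intensities, binder_sign=1, noise_floor=0.0):
    """Label each compound by the sign of its WaterLOGSY signal.

    peak_intensities: dict mapping compound name -> signed peak intensity.
    binder_sign: +1 or -1, the sign that binders show under the phasing
        convention in use (an assumption supplied by the caller).
    noise_floor: peaks with absolute intensity at or below this value are
        reported as "undetermined" rather than classified.
    """
    labels = {}
    for name, intensity in peak_intensities.items():
        if abs(intensity) <= noise_floor:
            labels[name] = "undetermined"
        elif (intensity > 0) == (binder_sign > 0):
            labels[name] = "binder"
        else:
            labels[name] = "nonbinder"
    return labels
```

Because the classification depends only on the sign within a single spectrum, no reference spectrum without target appears anywhere in the sketch, which mirrors the point made above.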

In preferred embodiments of the present invention, a majority of the 
compounds in the library have a solubility in deuterated water of at least about 1 
mM at room temperature (i.e., about 25°C to about 30°C), and a molecular 
weight of no greater than about 350 grams/mole. For effective use of a 
compound identified as a ligand for a given target in the search for a lead 
chemical template, preferably, the dissociation constant of the identified ligand 
to a target molecule is no weaker than (i.e., at least) about 100 µM. For effective 
use of a lead chemical template in further drug design, preferably, the 
dissociation constant for the lead chemical template to a target molecule is no 
weaker than (i.e., at least) about 1 µM. 

Brief Description of the Drawings 

Figure 1. Schematic diagram illustrating the use of NMR to discover a 
ligand having an approximate dissociation constant of 1.0 x 10⁻⁴ M (left figure), 
to use the discovered ligand to direct the discovery of a lead chemical template 
having an approximate dissociation constant of 1.0 x 10⁻⁶ M (middle figure), and 
then via synthetic chemistry and structure-directed drug design arrive at a drug 
candidate having an approximate dissociation constant of 1.0 x 10⁻⁸ M. 

Figure 2. Comparison of the two-dimensional HA (hydrogen-bond 
acceptor) vs. CHRG (charge) BCUT plots for the compounds contained in the 
NMR library described herein (dark squares) and a larger chemical library 
database (gray spots). 

Figure 3A. One-dimensional relaxation-edited ¹H NMR spectrum of a 
compound set containing three compounds designated (1), (2), and (3). 
Resonances are numbered corresponding to the individual components in the set. 

Figure 3B. One-dimensional relaxation-edited ¹H NMR spectrum of the 
same set of compounds shown in Figure 3A in the presence of flavodoxin. 
Arrows identify resonances that experience a significant reduction in intensity. 

Figure 4A. Region of the 2D ¹H-¹⁵N HSQC spectrum of flavodoxin 
alone and in the presence of a 10-fold excess of compound (1). Residues with 
significant chemical shift changes in the presence of (1) are boxed and labeled 
with their amino acid type and sequence number. 

Figure 4B. Secondary structure representation of the flavodoxin global 
fold. The flavin cofactor is shown in stick format. Residues with the largest 
chemical shift changes in the presence of (1) are shown in white. 

Figure 5A. One-dimensional relaxation-edited ¹H NMR spectrum of a 
compound set containing three compounds in the presence of flavodoxin. 

Figure 5B. One-dimensional relaxation-edited ¹H NMR spectrum of the 
same compound set shown in Figure 5A in the presence of the antibacterial 
target protein. Arrows identify resonances from Ligand A (Figure 6) that 
experience a significant reduction in intensity in the presence of the antibacterial 
target protein. 

Figure 6. IC50 values of the original ligand, Ligand A, and four 
structurally related compounds, Ligands B-E, identified in a similarity search 
based on the structure of Ligand A. 

Figure 7. Region of the 2D ¹H-¹⁵N HSQC spectrum of the antibacterial 
target protein alone and in the presence of a 10-fold excess of Ligand A. Several 
resonances with large chemical shift changes in the presence of Ligand A are 
boxed and labeled with their amino acid sequence number. 

Figure 8A. One-dimensional relaxation-edited ¹H NMR spectrum of a 
compound set containing ten compounds. 

Figure 8B. One-dimensional relaxation-edited ¹H NMR spectrum of the 
same set of compounds in Figure 8A in the presence of the antiviral target 
protein. Arrows identify resonances, all belonging to the same compound, that 
experience a significant reduction in intensity in the presence of the antiviral 
target protein. 

Figure 9. Region of the 2D ¹H-¹⁵N HSQC spectrum of the antiviral 
target protein alone and in the presence of the ligand identified from Figure 8. 
Several resonances with large chemical shift changes in the presence of this 
ligand are boxed and labeled with their amino acid sequence number. 

Figure 10. Schematic of the BEST flow system: (1) NMR console, (2) 
computer workstation, (3) Gilson sample handler, (4) flow probe in the magnet, 
and (5) nitrogen gas. The Gilson sample handler is labeled as follows: (A) 
keypad, (B) syringe, (C) injector, (D) solvent reservoir, (E) solvent rack, (F) 
sample racks, (G) waste reservoir, (H) Rheodyne valves, (I) injection port, and 
(J) recovery unit. 

Figure 11. Schematic of a Bruker flow probe showing (A) the total probe 
volume, (B) the flow cell volume, and (C) the positioning volume. 

Figure 12. 600.13 MHz ¹H NMR spectra of a 100 NMR library 
sample with the positioning volume set to (A) -100 µL, (B) 0 µL, and (C) +100 
µL. 

Figure 13. Overlay of the two-dimensional HA (hydrogen-bond 
acceptor) vs. CHRG (charge) BCUT plots for the compounds in the CMC index 
(gray) and the lead-like compounds contained therein (black). 

Figure 14. Regions of the 600.13 MHz relaxation-edited ¹H NMR 
spectra of a nine compound mixture (A) without and (B) with added target 
protein. Protein and each ligand were 50 µM. Spectra were acquired on a 
Bruker 5 mm flow-injection probe at 27°C. A total of 1K scans were collected 
resulting in a total acquisition time of about 60 minutes per spectrum. A 
relaxation filter of 174 milliseconds (ms) was used. Arrows identify resonances 
that disappear in the presence of protein. 

Figure 15. Region of the 600.13 MHz WaterLOGSY spectrum of a 
compound mixture with added target protein. The concentration of protein was 
10 µM while the concentration of each compound was 100 µM. The spectrum 
was acquired on a Bruker 5 mm flow-injection probe at 27°C. A total of 4K 
scans were collected resulting in a total acquisition time of about 288 minutes. 
A mixing time of 2.0 seconds was used. 

Figure 16. Regions of the 600.13 MHz relaxation-edited ¹H NMR 
spectra of a single compound (A) without and (B) with added target protein. 
Protein and ligand were 50 µM. Spectra were acquired on a regular Bruker 5 
mm TXI probe at 27°C. A total of 512 scans were collected resulting in a total 
acquisition time of about 30 minutes per spectrum. A relaxation filter of 174 ms 
was used. 

Detailed Description of Preferred Embodiments of the Invention 

The present invention involves the selection of a generally small library 
of structurally diverse compounds that are generally water soluble, have a 
relatively low molecular weight, and are amenable to synthetic chemistry 
elaboration. Significantly and advantageously, for certain embodiments, the 
present invention preferably involves carrying out a binding assay at relatively 
low concentrations of target and near equimolar ratios of ligand to target, or even 
at extremely low concentrations of target and higher ratios of ligand to target. 

In a method of the present invention, a relatively small subset of 
compounds (preferably, at least about 250, more preferably, at least about 300, 
most preferably, at least about 2000, and typically no more than about 10,000) 
that mimics the structural diversity of compounds in much larger collections is 
created based on a predetermined set of criteria. This generally small library is 
screened for binding affinity to a target molecule (as determined herein by 
dissociation constants). The compounds from the library that are identified to be 
effective ligands (typically, having an affinity for a desired target as evidenced 
by a dissociation constant of at least about 1.0 x 10⁻⁴ M) are then used to focus 
further screening efforts or to direct chemical elaborations to arrive at one or 
more lead chemical templates (which typically have an affinity for a desired 
target as evidenced by a dissociation constant of at least about 1.0 x 10⁻⁶ M). 
This process is shown schematically in Figure 1. 

Significantly, time and resources are saved by screening far fewer 
compounds using the present invention. Use of a binding assay, such as the one 
based on NMR spectroscopy described herein, eliminates the need to develop a 
high-throughput functional assay, and also allows the methods to be used on 
molecular targets lacking a known function. 

Thus, the present invention provides methods of identifying a compound 
that binds to a target molecule (preferably, a protein) that are based on NMR 
spectroscopy techniques. Such methods typically involve the use of 
relaxation-editing techniques, for example, which involve monitoring changes in 
resonance intensities (preferably, significant reductions in intensities) of the test 
compound upon the addition of the target molecule. Preferably, the 
relaxation-editing techniques are one-dimensional, and more preferably, 
one-dimensional ¹H NMR techniques. Alternatively, such methods can involve 
the use of WaterLOGSY. This involves the transfer of magnetization from bulk 
water to detect the binding interaction. Using WaterLOGSY techniques, binding 
compounds are distinguished from nonbinders by the opposite sign of their 
water-ligand nuclear Overhauser effects (NOEs). 

Important elements that contribute to the success of the methods of the 
invention preferably include developing a suitable small library of compounds to 
screen, carrying out the binding assay at low concentrations of target and near 
equimolar ratios of ligand to target (for relaxation-editing), or at extremely low 
concentrations of target (if desired) and higher ratios of ligand to target (for 
WaterLOGSY), and the capacity for rapid throughput of data collection. For 
example, for relaxation-editing NMR techniques, the concentration of target 
molecule is preferably no greater than about 1.0 x 10⁻⁴ M, and for WaterLOGSY 
NMR techniques, the concentration of target molecule is preferably no greater 
than about 10 µM, although higher concentrations can be used if desired. 

The selection of compounds in a small library (preferably, at least about 
250 compounds, more preferably, at least about 300 compounds, and most 
preferably, at least about 2000 compounds) is important in that its diversity 
should mimic the diversity of larger compound collections. Preferably, each 
component possesses many of the desirable qualities of a lead chemical 
template. These include water solubility, low molecular weight (preferably, no 
greater than about 350 grams/mole, more preferably, no greater than about 325 
grams/mole, and most preferably, less than about 325 grams/mole), and 
amenability to synthetic chemistry elaboration. Templates possessing these 
qualities, as compared to a template selected randomly, are preferably 
considered to be predisposed to being drug-like and having an increased 
likelihood of ultimately leading to a drug. 

Good structural diversity in a library increases the likelihood that one or 
more compounds will possess structural characteristics important for binding to 
a given molecular target. Predisposing the compounds to be water soluble, to 
have low molecular weight (preferably, no greater than about 350 grams/mole, 
more preferably, no greater than about 325 grams/mole, and most preferably, 
less than about 325 grams/mole), and to be amenable to synthetic elaboration 
increases the likelihood that a compound found to be a ligand will lead to a 
related compound or compounds suitable as a lead chemical template for use, for 
example, in a process of identifying an effective therapeutic and/or prophylactic 
agent. Additionally, the requirement for good water solubility (preferably, at 
least about 1.0 x 10⁻³ M in deuterated water at room temperature) is important in 
that it increases the likelihood of success of other downstream drug-design 
projects, such as co-crystallization attempts, calorimetry studies, and enzyme 
kinetic analyses. 

Carrying out a relaxation-editing binding assay (preferably, a 1D ¹H 
NMR assay) at low concentrations of target (preferably, no greater than about 
1.0 x 10⁻⁴ M, and more preferably, no greater than about 5.0 x 10⁻⁵ M) and near 
equimolar ratios of ligand to target creates the requirement that compounds 
testing positive for binding have affinities within a factor of about 3-4 of this 
same concentration (preferably, having a dissociation constant no weaker than 
about 2.0 x 10⁻⁴ M). A similar affinity threshold can be obtained by carrying out 
a WaterLOGSY-based binding assay at even lower target concentrations 
(preferably, no greater than about 10 µM, and can be no greater than about 1 
µM, but is more preferably about 1 µM to about 10 µM) and ligand to target 
ratios of about 100:1 to about 10:1. This level of affinity is desired if the 
subsequent steps of focused screening and directed chemical elaboration are to 
be successful in elucidating a lead chemical template with higher affinity (e.g., 
one having a dissociation constant of at least about 1.0 x 10⁻⁶ M). Carrying out 
the initial screening at these low concentrations also avoids detection of 
unwanted compounds with much larger dissociation constants in the 1.0 x 10⁻³ 
M range, which are less specific in their binding and therefore harder to turn into 
lead chemical templates given their weak affinity initially. 
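The link between assay concentration and the affinity threshold can be illustrated with the standard 1:1 binding equilibrium. The sketch below is not taken from the text: it assumes a simple two-state model (one ligand binding one site, no competition between mixture components) and computes the fraction of ligand bound from the total concentrations and the dissociation constant. The function name and the example values in the usage note are illustrative.

```python
import math


def fraction_ligand_bound(protein_total, ligand_total, kd):
    """Fraction of ligand bound at equilibrium for P + L <=> PL.

    All three arguments must be positive and in the same concentration
    unit (for example, micromolar). Solves
        Kd = (Pt - [PL]) * (Lt - [PL]) / [PL]
    for [PL] and returns [PL] / Lt.
    """
    if protein_total <= 0 or ligand_total <= 0 or kd <= 0:
        raise ValueError("concentrations and Kd must be positive")
    s = protein_total + ligand_total + kd
    # Physically meaningful (smaller) root of the quadratic in [PL].
    complex_conc = (s - math.sqrt(s * s - 4.0 * protein_total * ligand_total)) / 2.0
    return complex_conc / ligand_total
```

Under this model, with protein and ligand both at 50 µM, a ligand with a dissociation constant of 200 µM is roughly 17% bound, whereas one with a dissociation constant of 1000 µM is under 5% bound, which is consistent with weaker binders going undetected at these concentrations.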

The capacity for rapid throughput of data collection is important if a 
large number of molecular targets are to be screened. Preferably, flow NMR 
techniques can reduce the amount of time and effort required to evaluate small 
molecules for binding to a given target. For example, the use of a Bruker 
Efficient Sample Transfer system in combination with a tubeless, flow-injection 
NMR probe has proven to be much faster and less labor intensive than the use of 
traditional NMR tubes. A significant increase in throughput is obtained 
compared to both manual sample changing and to using an autosampler. 
Implementation of the screening process using multiwell sample holders also 
standardizes the experimental setup as well as the components in a given mixture 
from one molecular target to the next. 

The following is a description of a preferred method for carrying out the 
present invention. It is provided for exemplification purposes only and should 
not be considered to unnecessarily limit the invention as set forth in the claims. 

In the design of a preferred small library of structurally diverse 
compounds according to the present invention, compounds were selected from a 
large library based on dissimilarity, predicted water solubility, low molecular 
weight, and chemical intuition. Some were based on frameworks suggested in 
the literature, although some literature-suggested frameworks were consciously 
avoided. Each compound was tested for solubility at 1.0 x 10⁻³ M in ²H₂O and 
for purity by mass spectrometry and ¹H NMR spectroscopy. Compounds 
deemed to be water soluble and pure were kept for inclusion in the final library 
(approximately 30% of the initial compounds). The resulting library contains 
approximately 300 compounds. One measure of the degree of structural 
diversity of the compounds in this small library is shown in Figure 2. This is 
based on the technique described in Pearlman et al., Perspectives in Drug 
Discovery & Design, 9, 339-353 (1998). Preferably, the compound library 
includes compounds of sufficiently diverse chemical structure that one would 
expect at least one compound to bind to a given target protein with an affinity 
(dissociation constant) no weaker than (i.e., at least) about 200 µM. Herein, 
compounds of diverse chemical structure are those that have a variety of 
backbone hydrocarbon structures (e.g., linear, branched, cyclic - which may or 
may not be aromatic, have fused rings, etc.), optionally including a variety of 
heteroatoms (e.g., oxygen, nitrogen) and a variety of functional groups (e.g., 
carbonyls) in a variety of positions (e.g., pointing in various directions at a 
variety of distances from each other). Ideally, using the technique described in 
Pearlman et al., Perspectives in Drug Discovery & Design, 9, 339-353 (1998), 
the library of compounds displays a pattern of well-dispersed black squares (e.g., 
see Figure 2). 

In order to increase the throughput of the NMR screening, compounds 
were grouped into 32 sets of 6-10 compounds that have at least one 
distinguishable resonance in a 1D ¹H NMR spectrum of the mixture. To 
accomplish this, a 1D ¹H NMR spectrum was obtained of each mixture in 100% 
²H₂O and in 0.1 M sodium phosphate/100% ²H₂O at pH 6.5. Two solvents were 
used in order to determine the assignment of pH-titratable resonances in the 
spectrum. Each of the 32 mixtures was then plated out into separate wells of a 
96-well plate, using 25 µL of a 1.0 x 10⁻³ M solution, and frozen at -80°C until 
needed. In an initial version of the NMR screening library, approximately 70 
compounds were grouped into 21 sets of 3-4 compounds each. 

After a 96-well plate had completely thawed, a solution containing a 
molecular target protein was added to each well containing a mixture of 
compounds in the 96-well plate. The final concentration of protein is typically 
about 5.0 x 10⁻⁵ M. The ratio of each compound in a mixture to protein is 
typically about 1:1. This process typically involves adding 475 µL of protein 
solution to each mixture. Dispersion throughout the mixture was facilitated by 
shaking the 96-well plate for 20 minutes following addition of protein. 
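The plating volumes and the stated final concentrations are related by a simple dilution. The sketch below assumes that the protein solution is added in a volume of 475 µL to the 25 µL of 1.0 x 10⁻³ M compound mixture in each well (500 µL total); the function names are illustrative and the protein stock concentration it derives is a calculated value, not a figure given in the text.

```python
def final_concentration_um(stock_um, stock_volume_ul, added_volume_ul):
    """Concentration (µM) of a component after a second solution is added."""
    total_volume_ul = stock_volume_ul + added_volume_ul
    return stock_um * stock_volume_ul / total_volume_ul


def required_stock_um(target_um, stock_volume_ul, other_volume_ul):
    """Stock concentration (µM) needed to reach target_um after mixing."""
    total_volume_ul = stock_volume_ul + other_volume_ul
    return target_um * total_volume_ul / stock_volume_ul


# Each compound: 25 µL at 1000 µM diluted to 500 µL total.
compound_final = final_concentration_um(1000.0, 25.0, 475.0)

# Protein stock needed so that 475 µL in 500 µL total gives 50 µM.
protein_stock = required_stock_um(50.0, 475.0, 25.0)
```

Under these assumptions each compound ends at 50 µM, matching the typical final protein concentration of about 5.0 x 10⁻⁵ M and the roughly 1:1 ratio, and the protein solution would need to be about 52.6 µM before addition.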

A 1D relaxation-edited ¹H NMR spectrum was collected on each 
protein/compound mixture solution using a Bruker DRX600 or a Bruker 
AMX400 spectrometer equipped with a shielded magnet, a Gilson sample 
handler, and a 5 mm (250 µL sample cell) flow-injection NMR probe. The use 
of a shielded magnet greatly reduces the magnetic fringe field surrounding the 
high field magnet and allows the Gilson sample handler to be placed in close 
proximity to the magnet. The Gilson liquid sample handler transfers samples 
from 96-well plates into the flow-injection probe and, if desired, returns the 
samples back to the 96-well plate. A compound or compounds that bind to a 
given target are identified by comparing the 1D relaxation-edited ¹H NMR 
spectrum collected in the presence of added protein to that of the identical 
mixture of compounds in the absence of protein. A compound is identified as a 
ligand for a given target if one or more of its resonances (preferably ¹H 
resonance or resonances) are significantly reduced (i.e., greater than about 75% 
reduction in one or more resonances) in intensity in the presence of target 
molecule (e.g., protein) as compared to the spectrum collected in an identical 
fashion in the absence of target molecule (e.g., protein). 
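The comparison rule above can be sketched as follows. This is a minimal illustration, assuming the resolved resonances of each compound have already been assigned and integrated in both spectra and that the two spectra are on a comparable intensity scale; the data layout and function names are hypothetical.

```python
def fractional_reduction(reference_intensity, target_intensity):
    """Fractional loss of intensity relative to the no-target spectrum."""
    if reference_intensity <= 0:
        raise ValueError("reference intensity must be positive")
    return 1.0 - target_intensity / reference_intensity


def find_ligands(reference_peaks, target_peaks, threshold=0.75):
    """Return compounds with at least one resonance reduced by more than threshold.

    reference_peaks / target_peaks: dicts mapping compound name to a list of
    peak intensities for that compound's resolved resonances, in the same
    order, without and with target molecule respectively.
    """
    hits = []
    for name, ref_intensities in reference_peaks.items():
        tgt_intensities = target_peaks[name]
        if any(fractional_reduction(ref, tgt) > threshold
               for ref, tgt in zip(ref_intensities, tgt_intensities)):
            hits.append(name)
    return hits
```

The default `threshold=0.75` follows the "greater than about 75% reduction" criterion stated above; a single qualifying resonance is enough to flag a compound, as the text specifies "one or more" resonances.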

Sample requirements can be reduced even further if WaterLOGSY 
methods are used as an alternative to the relaxation-editing method described 
above to detect the binding interaction. WaterLOGSY is described in more 
detail in C. Dalvit et al., J. Biomol. NMR, 18, 65-68 (2000). 

Since the WaterLOGSY experiment relies on the transfer of 
magnetization from bulk water to detect the binding interaction, it is a very 
sensitive technique. As such, the concentration of protein in each sample can be 
reduced to no greater than about 10 µM while the concentration of each 
compound can be about 100 µM. This results in ratios of compounds to target 
molecule in each sample reservoir of about 100:1 to about 10:1. The exact 
concentrations and ratios used can vary depending on the size of the target 
molecule, the amount of target molecule available, the desired binding affinity 
detection limit, and the desired speed of data collection. In contrast to the 
relaxation-editing method, there is no need to collect a comparison or control 
spectrum to distinguish binding compounds from nonbinders. 

Instead, binding compounds are distinguished from nonbinders by the 
opposite sign of their water-ligand nuclear Overhauser effects (NOEs). Ligand 
binding was confirmed by making fresh solutions containing only the identified 
ligand, with and without added protein at a 1:1 ratio, and comparing the 1D 
relaxation-edited ¹H NMR spectra. In addition, the ligand's dissociation 
constant was estimated by analyzing several 1D diffusion-edited ¹H NMR 
spectra collected at several gradient strengths. The relative diffusion coefficients 
for the protein, for the ligand in the presence of protein, and for the ligand in the 
absence of protein, in conjunction with known protein and ligand concentrations, 
were used to estimate the ligand's dissociation constant. These spectra are 
typically collected using an NMR spectrometer, a conventional high resolution 
probe, and regular 5 mm NMR tubes. 
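One common way to turn the three diffusion coefficients into a dissociation constant is sketched below. It is an assumption, not a procedure stated in the text: the ligand is taken to be in fast exchange between free and bound states, the bound ligand is taken to diffuse at the same rate as the protein, and binding is taken to be 1:1. The function names are illustrative.

```python
def bound_fraction(d_free, d_observed, d_protein):
    """Fraction of ligand bound under a fast-exchange, two-state model.

    Assumes D_observed = f * D_protein + (1 - f) * D_free. The three
    coefficients may be in any common (even relative) unit.
    """
    if not d_protein < d_free:
        raise ValueError("protein must diffuse more slowly than free ligand")
    fraction = (d_free - d_observed) / (d_free - d_protein)
    if not 0.0 < fraction <= 1.0:
        raise ValueError("observed coefficient is outside the two-state range")
    return fraction


def estimate_kd(d_free, d_observed, d_protein, ligand_total, protein_total):
    """Estimate Kd (same unit as the concentrations) for 1:1 binding."""
    fraction = bound_fraction(d_free, d_observed, d_protein)
    complex_conc = fraction * ligand_total
    free_ligand = ligand_total - complex_conc
    free_protein = protein_total - complex_conc
    if free_protein < 0:
        raise ValueError("bound ligand exceeds total protein")
    return free_protein * free_ligand / complex_conc
```

As a purely hypothetical example, relative diffusion coefficients of 5.0 (free ligand), 4.0 (ligand with protein), and 1.0 (protein) give a bound fraction of 0.25; with ligand and protein both at 50 µM, the estimated dissociation constant is 112.5 µM.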

Once a ligand has been identified and confirmed, its structure is used to 
identify available compounds with similar structures to be assayed for activity or 
affinity, or to direct the synthesis of structurally related compounds to be assayed 
for activity or affinity. These compounds are then either obtained from 
inventory or synthesized. Most often, they are then assayed for activity using 
enzyme assays. In the case of molecular targets that are not enzymes or that do 
not have an enzyme assay available, these compounds can be assayed for affinity 
using NMR techniques similar to those described above, or by other physical 
methods such as isothermal denaturation calorimetry. Compounds identified in 
this step with affinities for the molecular target of about 1.0 x 10⁻⁶ M are 
typically considered lead chemical templates. 

In some instances, ligand binding is further studied using more complex
NMR experiments or other physical methods such as calorimetry or X-ray
crystallography. These downstream studies have a greater chance of success
since the ligands and lead chemical templates so identified are fairly water
soluble. For instance, if [¹⁵N]protein is available, 2D ¹H-¹⁵N HSQC
(heteronuclear single quantum correlation) spectra can be collected with and
without added ligand to locate the ligand's binding site on the protein. In cases
where the protein is small enough (molecular weight less than about 30,000) and
further characterization of protein/ligand interactions is desired, 3D NMR
experiments can be carried out on [¹³C/¹⁵N]protein/[¹²C/¹⁴N]ligand complexes.
Attempts to soak lead chemical templates identified by this method into existing
protein crystals, or to form co-crystals, can also be carried out.



Examples 

Objects and advantages of this invention are further illustrated by the
following examples, but the particular materials and amounts thereof recited in
these examples, as well as other conditions and details, should not be construed
to unduly limit this invention.

Example 1. Use of NMR Spectroscopy to Identify Ligands for Flavodoxin 

Reference 1D ¹H NMR spectra of the individual compounds and
combinations of compounds were recorded in ²H₂O solution on a Bruker ARX-400
spectrometer. One-dimensional relaxation-edited ¹H NMR spectra of
samples containing a mixture of flavodoxin and a given compound combination
were recorded in ²H₂O solution on a Bruker DRX-500 spectrometer. A spin lock
time of 350 milliseconds was used. The screening experiments were carried out
on solutions that were 5.0 x 10⁻⁵ M flavodoxin and 1.0 x 10⁻⁴ M of each ligand
present. Two-dimensional ¹H-¹⁵N HSQC spectra were recorded in solution
on a Bruker DRX-500 spectrometer. Samples were 5.0 x 10⁻⁵ M flavodoxin with
a 3-10 fold excess of a given ligand. All solutions containing flavodoxin were
buffered with 1.0 x 10⁻² M phosphate at pH 6.4. The Desulfovibrio vulgaris
flavodoxin used in all experiments was ¹⁵N-enriched.

To create the NMR ligand screening library, an initial set of compounds
was selected by a search of a larger library of compounds based on dissimilarity,
predicted water solubility, low molecular weight (preferably, no greater than
about 350 grams/mole, more preferably, no greater than about 325 grams/mole,
and most preferably, less than about 325 grams/mole), and chemical intuition.
These compounds were then tested for water solubility and purity. Compounds
with no visible precipitate or suspension at a concentration of 1.0 x 10" M were
deemed to be water soluble. Compounds with the predicted parent ion molecular
weight and otherwise normal mass spectra were deemed to be pure. Reference
1D ¹H NMR spectra were collected on compounds meeting these criteria.
Combinations of three or four compounds were then assembled in which at least
one distinguishing ¹H NMR resonance for each compound could be readily
identified. A reference 1D ¹H NMR spectrum was then recorded for each
combination of compounds. As an example, three compounds, designated here
as (1), (2), and (3), were combined into one set. The 1D ¹H NMR spectrum of
this combination set is illustrated in Figure 3A. Resonances from each of the
individual components are readily identified, especially in the aliphatic region of
the spectrum. At the time of this work, the NMR ligand library contained
approximately 70 compounds incorporated into 21 unique assortments
containing three or four compounds each.
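
The requirement that each compound in a combination have at least one distinguishing resonance can be checked programmatically. The sketch below is illustrative only: the function name, the 0.05 ppm tolerance, and all peak positions other than the 1.8 and 3.7 ppm resonances of compound (1) are assumptions, not values from this work.

```python
def has_distinguishing_resonance(mixture, tolerance_ppm=0.05):
    """Return True if every compound has at least one resolved peak.

    mixture maps a compound label to its list of 1H chemical shifts (ppm).
    A peak counts as distinguishing when no peak from any other compound
    lies within tolerance_ppm of it (tolerance is an assumed value).
    """
    for name, shifts in mixture.items():
        other_peaks = [peak
                       for other, peaks in mixture.items() if other != name
                       for peak in peaks]
        resolved = any(
            all(abs(shift - peak) > tolerance_ppm for peak in other_peaks)
            for shift in shifts
        )
        if not resolved:
            return False
    return True


# Hypothetical peak lists for a three-compound combination.
candidate = {"1": [1.8, 3.7, 7.2], "2": [2.1, 7.2], "3": [0.9, 7.21]}
usable = has_distinguishing_resonance(candidate)
```

Here the aromatic peaks overlap, but each compound retains a resolved aliphatic resonance, so the combination would be accepted.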

One-dimensional relaxation-edited ¹H NMR spectroscopy was used to
screen the library for binding to the model target protein, Desulfovibrio vulgaris
flavodoxin. For most of the compound combinations in the presence of
flavodoxin, there was little or no reduction in resonance intensity with the
350-millisecond spin-lock time. However, for two of the compound combinations,
the intensities of resonances corresponding to one of the compounds in the
mixture were significantly reduced. Figure 3B exemplifies this for the same
combination illustrated in Figure 3A. The resonances corresponding to (2) and
(3) are not affected by the spin-lock filter in the presence of flavodoxin.
However, the two aliphatic resonances of (1) at 1.8 ppm and 3.7 ppm are
significantly reduced in intensity by the spin-lock filter in the presence of
flavodoxin, indicating that (1) is binding to the protein. Similar experiments
indicated that a second compound, contained within a different combination of
compounds, also binds to flavodoxin. These were the only two compounds
among those tested that clearly bind to flavodoxin.
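
The comparison of resonance intensities with and without protein can be expressed as a simple ratio test. The following sketch is only an illustration: the 0.5 cutoff, the function name, and the intensity values are assumptions, since the text describes the reductions qualitatively as "significant" and gives no numerical threshold.

```python
def flag_binders(reference, with_protein, max_ratio=0.5):
    """Return compounds whose peaks are attenuated in the presence of protein.

    reference and with_protein map a compound label to peak intensities
    listed in the same peak order. A compound is flagged when the mean
    ratio (with protein / reference) is at or below max_ratio, which is
    an assumed cutoff rather than a value from the source.
    """
    hits = []
    for compound, ref_peaks in reference.items():
        ratios = [observed / ref
                  for observed, ref in zip(with_protein[compound], ref_peaks)]
        if sum(ratios) / len(ratios) <= max_ratio:
            hits.append(compound)
    return sorted(hits)


# Hypothetical intensities for the combination of compounds (1), (2), and (3).
reference = {"1": [100.0, 80.0], "2": [90.0], "3": [120.0]}
with_flavodoxin = {"1": [30.0, 20.0], "2": [88.0], "3": [115.0]}
binders = flag_binders(reference, with_flavodoxin)
```

With these made-up intensities only compound (1) is flagged, mirroring the qualitative result described for Figure 3B.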

Two-dimensional ¹H-¹⁵N HSQC spectra were subsequently recorded on
[¹⁵N]flavodoxin to further investigate the interaction of these two ligands with
the protein. Since amide backbone ¹H and ¹⁵N resonance assignments for this
protein are known (Stockman et al., J. Biomol. NMR, 3, 133-149 (1993)),
analysis of the ligand-induced changes in ¹H and ¹⁵N chemical shifts could be
used to identify the ligand binding sites. Typical chemical shift changes
observed are delineated in Figure 4A, which shows an overlay of the ¹H-¹⁵N
HSQC spectra of flavodoxin alone and in the presence of excess (1). Residues
with the largest ligand-induced chemical shift changes are indicated in white on
the structure of the protein (Watt et al., J. Mol. Biol., 218, 195-208 (1991)) in
Figure 4B. Compound (1) binds near the flavin cofactor binding site.
Interestingly, the binding sites as defined by this data for the two ligands
identified are at adjacent, partially overlapping locations on the surface near the
flavin cofactor binding site.
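
One common way to rank residues by ligand-induced shift change is to combine the ¹H and ¹⁵N changes into a single weighted distance. The sketch below is an assumption about how such a ranking could be done and is not the analysis used for Figure 4: the 0.2 weighting of the ¹⁵N dimension, the function names, and the residue labels and shifts are all illustrative.

```python
import math


def combined_shift_change(delta_h_ppm, delta_n_ppm, n_weight=0.2):
    """Weighted distance combining 1H and 15N shift changes (ppm).

    n_weight scales the 15N change to the 1H range; 0.2 is an assumed,
    commonly used value and is not specified in the source.
    """
    return math.sqrt(delta_h_ppm ** 2 + (n_weight * delta_n_ppm) ** 2)


def most_perturbed(free, bound, top_n=3, n_weight=0.2):
    """Rank residues by combined shift change between two peak lists.

    free and bound map a residue label to a (1H ppm, 15N ppm) pair.
    """
    changes = {
        residue: combined_shift_change(bound[residue][0] - h,
                                       bound[residue][1] - n, n_weight)
        for residue, (h, n) in free.items() if residue in bound
    }
    return sorted(changes, key=changes.get, reverse=True)[:top_n]


# Hypothetical amide peak positions without and with ligand.
free_peaks = {"A": (8.00, 120.0), "B": (8.50, 118.0), "C": (7.90, 125.0)}
bound_peaks = {"A": (8.00, 120.0), "B": (8.60, 118.5), "C": (7.92, 125.0)}
ranked = most_perturbed(free_peaks, bound_peaks, top_n=2)
```

Residues returned at the top of such a ranking would be the candidates to map onto the protein structure.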

Example 2. Use of NMR Spectroscopy to Identify a Lead Chemical
Template for an Antibacterial Target Protein

Numerous protein targets are amenable to an NMR process of identifying
a lead chemical template. In this example, the technique is illustrated for an
antibacterial target protein with a molecular weight of about 20 kDa.

All solutions containing the antibacterial target protein were buffered
with 2.5 x 10⁻² M phosphate at pH 7.4. The protein used for the 1D screening
and dissociation constant determination experiments was unlabeled, while that
used for the 2D ¹H-¹⁵N HSQC experiments was ¹⁵N-enriched.

One-dimensional relaxation-edited ¹H NMR spectra of samples
containing a mixture of the target protein and a given compound combination
were recorded in ²H₂O solution on a Bruker DRX-500 spectrometer. A spin lock
time of 350 milliseconds was used. The screening experiments were carried out
on solutions that were 1.0 x 10⁻⁴ M target protein and 1.0 x 10^ M of each
ligand. The library used for the screening process was identical to that described
in Example 1.

Two-dimensional ¹H-¹⁵N HSQC spectra were recorded in ¹H₂O solution
on a Bruker DRX-500 spectrometer. Samples contained 8.0 x 10⁻⁵ M target
protein with a 9-10 fold excess of a given ligand.

Ligand dissociation constants were estimated by determining relative
diffusion coefficients for target protein alone, ligand in the absence of target
protein, and ligand in the presence of target protein (Lennon et al., Biophys. J.,
67, 2096-2109 (1994)). Relative diffusion coefficients were determined using
pulsed-field-gradient NMR experiments incorporating a bipolar longitudinal
eddy-current delay sequence (Wu, J. Magn. Reson. Ser. A, 115, 260-264 (1995)).

One-dimensional relaxation-edited ¹H NMR spectroscopy was used to
screen the small molecule library for binding to this target protein in a manner
analogous to that previously described in Example 1. With this technique, a
reduction in resonance intensity is observed if a compound interacts with the
target protein, thus identifying it as a ligand. For most of the compound
combinations in the presence of the antibacterial target protein, there was little or
no reduction in resonance intensity with the 350-millisecond spin-lock time.
However, for some of the compound combinations, the intensities of resonances
corresponding to one of the compounds in the mixture were significantly
reduced. The results from one such compound combination are described here.

As a control, the 1D relaxation-edited ¹H NMR spectrum of a certain
mixture in the presence of a different protein, flavodoxin, is shown in Figure 5A.
All ligand resonances are observed with full intensity. The corresponding 1D
relaxation-edited ¹H NMR spectrum of this same mixture acquired in the
presence of the antibacterial target protein is shown in Figure 5B. The
intensities of all resonances corresponding to Ligand A in Figure 5B are clearly
reduced in the presence of the antibacterial target protein. This indicates that
Ligand A is binding to the protein. The binding is specific to the antibacterial
target protein since the resonance intensities are not reduced in the presence of
flavodoxin.

Binding of Ligand A was confirmed by repeating the relaxation-filtered
experiments on a solution containing protein and just Ligand A. Using this same
sample, as well as samples of protein alone and Ligand A alone, a separate set of
experiments that use pulsed-field-gradient techniques was collected to determine
relative diffusion coefficients. From this data, the dissociation constant for
Ligand A was estimated by NMR measurements to be approximately
1.4 x 10⁻⁴ M.

In order to ascertain whether the binding of Ligand A and structurally
related analogs inhibited the activity of this enzyme, and if so to what degree,
IC50 values were determined. To determine IC50 values, various concentrations
of selected compounds, originally prepared at 1.0 x 10⁻² M in 100% DMSO,
were titered out to provide at least 12 individual concentrations. Twenty-five
(25) µL of each solution (15% DMSO maximum) were added to wells in a
96-well plate, followed by 100 microliters (µL) of a cocktail containing 100
nanograms (ng) of target protein at pH 7.0. Finally, 25 µL of substrate solution
was added and the plate (Immulon 2, Dynex) was read in 15 second intervals at
405 nanometers (nm) on a Spectramax 250 plate reader. IC50 profiles and values
were generated using the program Softmax.
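
For readers unfamiliar with how an IC50 is read from a titration, the sketch below interpolates the 50% point on a logarithmic concentration axis. It is a simplified stand-in and not the curve fitting performed by the Softmax program; the function name and the activity values are hypothetical.

```python
import math


def ic50_by_interpolation(concentrations, percent_activity):
    """Interpolate the concentration giving 50% of control activity.

    concentrations must be ascending (M); percent_activity is the enzyme
    activity relative to an uninhibited control. Interpolation is linear
    in log10(concentration). Returns None if 50% is never bracketed.
    """
    points = list(zip(concentrations, percent_activity))
    for (c_low, a_low), (c_high, a_high) in zip(points, points[1:]):
        if a_low >= 50.0 >= a_high and a_low != a_high:
            fraction = (a_low - 50.0) / (a_low - a_high)
            log_ic50 = math.log10(c_low) + fraction * (
                math.log10(c_high) - math.log10(c_low))
            return 10 ** log_ic50
    return None


# Hypothetical four-point titration of a single inhibitor.
ic50 = ic50_by_interpolation([1e-6, 1e-5, 1e-4, 1e-3],
                             [95.0, 80.0, 40.0, 10.0])
```

A real analysis with 12 or more concentrations would normally fit a sigmoidal dose-response curve rather than interpolate between two points.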

Ligand A was shown to inhibit this enzyme with an IC50 value of
approximately 9.0 x 10⁻⁵ M. Subsequently, a similarity search resulted in the
testing of about 10 structurally related compounds for enzyme inhibition. As
shown in Figure 6, four of these compounds had IC50 values between 2.0 x 10⁻⁵
M and 1.0 x 10⁻⁶ M. These very low affinity compounds can serve as lead
chemical templates for the design of drugs directed against this molecular target.

Two-dimensional ¹H-¹⁵N HSQC spectra were subsequently recorded on
[¹⁵N]target protein with and without Ligand A present to further investigate the
interaction of this ligand with the protein. Chemical shift changes observed in
the presence of Ligand A are delineated in Figure 7, which shows an overlay of
the ¹H-¹⁵N HSQC spectra of protein alone and in the presence of a 10-fold
excess of ligand. Residues with the largest ligand-induced chemical shift
changes are boxed.

In this study, a ligand that binds to an antibacterial target protein with a
dissociation constant of less than about 2.0 x 10⁻⁴ M was identified from a small
library of compounds. No prior knowledge of what types of ligands ought to
bind to this protein was used. The identified ligand was shown to inhibit this
enzyme with an IC50 value of approximately 9.0 x 10⁻⁵ M. Subsequently, a
similarity search based on the structure of this NMR-identified ligand resulted in
the testing of about 10 structurally related compounds for enzyme inhibition.
Four of these compounds had IC50 values between about 2.0 x 10⁻⁵ M and about
1.0 x 10⁻⁶ M. These very low affinity compounds can serve as lead chemical
templates for the design of drugs directed against this molecular target. More
extensive NMR experiments, using isotopically-enriched target protein,
concluded that the compounds identified as lead chemical templates do in fact
bind to the active site of the target protein.

Example 3. Use of NMR Spectroscopy to Identify a Lead Chemical
Template for an Antiviral Target Protein

Numerous protein targets are amenable to this NMR process of
identifying a lead chemical template. In this example, the technique is illustrated
for an antiviral target protein with a monomer molecular weight of
approximately 8 kDa that exists as a dimer in solution. This target protein was
screened using an NMR screening library and flow NMR spectroscopy.

All solutions containing the antiviral target protein were buffered with
2.0 x 10⁻² M phosphate at pH 6.5. The protein used for the 1D screening and
dissociation constant determination experiments was unlabeled, while that used
for the 2D ¹H-¹⁵N HSQC experiments was ¹⁵N-enriched.

One-dimensional relaxation-edited ¹H NMR spectra of samples
containing a mixture of the target protein and a given compound combination
were recorded in ²H₂O solution on a Bruker AMX-400 spectrometer. The
spectrometer was equipped with a shielded magnet, a Gilson sample handler, and
a 5 mm (250 µL sample cell) flow-injection NMR probe. A spin lock time of
350 milliseconds was used. The screening experiments were carried out on
solutions that were 3.8 x 10⁻⁵ M target protein and 5.0 x 10⁻⁵ M of each ligand.
All solutions were contained in a 96-well plate and were delivered to the 5 mm
flow-injection probe using the Gilson sample handler. The library used for the
screening process was expanded from that described in the first two examples. It
contained approximately 300 compounds grouped into 32 separate mixtures.

Two-dimensional ¹H-¹⁵N HSQC spectra were recorded in solution
on a Bruker DRX-500 spectrometer. Samples contained 8.3 x 10⁻⁴ M target
protein alone or in the presence of a given ligand.

Ligand dissociation constants were estimated by determining relative
diffusion coefficients for target protein alone, ligand in the absence of target
protein, and ligand in the presence of target protein (Lennon et al., Biophys. J.,
67, 2096-2109 (1994)). Relative diffusion coefficients were determined using
pulsed-field-gradient NMR experiments incorporating a bipolar longitudinal
eddy-current delay sequence (Wu, J. Magn. Reson. Ser. A, 115, 260-264 (1995)).

One-dimensional relaxation-edited ¹H NMR spectroscopy was used to
screen the expanded small molecule library for binding to this antiviral target
protein in a manner analogous to that previously described in the first two
examples. With this technique, a reduction in resonance intensity is observed if
a compound interacts with the target protein, thus identifying it as a ligand. For
most of the compound combinations in the presence of the antiviral target
protein, there was little or no reduction in resonance intensity with the
350-millisecond spin-lock time. However, for some of the compound combinations,
the intensities of resonances corresponding to one of the compounds in the
mixture were significantly reduced. The results from one such compound
combination are described here.

As a control, the 1D relaxation-edited ¹H NMR spectrum of a certain
mixture in the absence of protein is shown in Figure 8A. All resonances are
observed with full intensity. The corresponding 1D relaxation-edited ¹H NMR
spectrum acquired in the presence of the antiviral target protein is shown in
Figure 8B. The intensities of all resonances corresponding to a single compound
in Figure 8B are clearly reduced in the presence of the antiviral target protein.
This indicates that this compound is binding to the protein. The binding is
specific to the antiviral target protein since the resonance intensities are not
reduced in the presence of other protein targets that have been screened.

In a separate set of experiments that use pulsed-field-gradient techniques
to determine relative diffusion coefficients, the dissociation constant for the
identified ligand was estimated by NMR measurements to be approximately
40 µM.

Two-dimensional ¹H-¹⁵N HSQC spectra were subsequently recorded on
[¹⁵N]target protein with and without the identified ligand present to further
investigate the interaction of this ligand with the protein. Chemical shift changes
observed in the presence of this ligand are delineated in Figure 9, which shows
an overlay of the ¹H-¹⁵N HSQC spectra of protein alone and in the presence of
ligand. Residues with the largest ligand-induced chemical shift changes are
labeled.

Example 4. Screening of Compound Libraries for Protein Binding Using
Flow-Injection NMR Spectroscopy

Introduction 

Flow NMR spectroscopy techniques are becoming increasingly utilized
in drug discovery and development (B. J. Stockman, Curr. Opin. Drug Disc.
Dev., 3, 269-274 (2000)). The technique was first applied to couple the
separation characteristics of liquid chromatography with the analytical
capabilities of NMR spectroscopy (N. Watanabe et al., Proc. Jpn. Acad. Ser. B,
54, 194 (1978)). Since then, HPLC-NMR, or LC-NMR as it is more commonly
referred to, has been broadly applied to natural products biochemistry, drug
metabolism and drug toxicology studies (J. C. Lindon et al., Prog. NMR Spectr.,
29, 1 (1996); J. C. Lindon et al., Drug. Met. Rev., 29, 705 (1997); B. Vogler et
al., J. Nat. Prod., 61, 175 (1998); and J.-L. Wolfender et al., Curr. Org. Chem., 2,
575 (1998)). The wealth and complexity of data made available from the latter
two applications have created the potential for NMR-based metabonomics to
complement genomics and proteomics (J. K. Nicholson et al., Xenobiotica, 29,
1181 (1999)). Stopped-flow analysis in LC-NMR, where the chromatographic
flow is halted to obtain an NMR spectrum with higher signal-to-noise and then
restarted when the spectrum has finished collecting, was the forerunner to the
flow-injection systems that will be described here. The largest difference
between the two systems is that one includes a separation component (LC
column) and the other does not. The rapid throughput possible for combinatorial
chemistry samples and protein/small molecule mixtures has allowed
flow-injection NMR methods to impact medicinal chemistry and protein screening
(P. A. Keifer, Drugs Fut., 23, 301 (1998); P. A. Keifer, Drug Disc. Today, 2, 468
(1997); P. A. Keifer, Curr. Opin. Biotech., 10, 34 (1999); K. A. Farley et al.,
SMASH'99, Argonne, IL, 15-18 August 1999; and A. Ross et al., J. Biomol. NMR,
16, 139 (2000)).

Changes in chemical shifts, relaxation properties or diffusion coefficients
that occur upon the interaction between a protein and a small molecule have
been documented for many years (for recent reviews see M. J. Shapiro et al.,
Curr. Opin. Drug. Disc. Dev., 2, 396 (1999); J. M. Moore, Biopolymers, 51, 221
(1999); and B. J. Stockman, Prog. NMR Spectr., 33, 109 (1998)). Observables
typically used to detect or monitor the interactions are chemical shift changes for
the ligand or isotopically-enriched protein resonances (J. Wang et al.,
Biochemistry, 31, 921 (1992)), or line broadening (D. L. Rabenstein et al., J.
Magn. Reson., 34, 669 (1979); and T. Scherf et al., Biophys. J., 64, 754 (1993)),
change in sign of the NOE from positive to negative (P. Balaram et al., J. Am.
Chem. Soc., 94, 4017 (1972); and A. A. Bothner-By et al., Ann. NY Acad. Sci.,
222, 668 (1972)), or restricted diffusion (A. J. Lennon et al., Biophys. J., 67,
2096 (1994)) for the ligand. For the most part, these studies have focussed on
protein/ligand systems where the small molecule was already known to be a
ligand or was assumed to be one. In the last several years, however, the work of
the Fesik (S. B. Shuker et al., Science, 274, 1531 (1996); and P. J. Hajduk et al.,
J. Am. Chem. Soc., 119, 12257 (1997)), Meyer (B. Meyer et al., Eur. J.
Biochem., 246, 705 (1997)), Moore (J. Fejzo et al., Chem. Biol., 6, 755 (1999)),
Shapiro (M. Lin et al., J. Org. Chem., 62, 8930 (1997)), and Dalvit (C. Dalvit et
al., J. Biomol. NMR, 18, 65-68 (2000)) labs has demonstrated the applicability of
these same general methods as a screening tool to identify ligands from mixtures
of small molecules.

These screening protocols typically involve the preparation of a series of
individual samples in glass NMR tubes and the use of an autosampler to achieve
reasonable throughput. Variations in volume or positioning that occur during
sample preparation or tube insertion can necessitate tuning and calibration of the
probe between each sample, thereby reducing throughput of data collection.

By contrast, flow-injection NMR has several advantages. The stationary
flow cell provides uniform locking and shimming from one sample to the next,
and, with the radio frequency coils mounted directly onto the flow cell's glass
surface, high sensitivity. Fast throughput of data collection is thus possible. Use
of a liquid handler to prepare and inject samples, such as the Gilson 215 liquid
handler used on Bruker and Varian systems, allows the potential for on-the-fly
sample preparation (A. Ross et al., J. Biomol. NMR, 16, 139 (2000)), thus
maximizing sample integrity and uniformity. Since the use and/or re-use of
glass NMR tubes is avoided, costs are minimized.

Data Acquisition Hardware and Software

A typical Flow NMR system consists of a magnet, an NMR console, a
computer workstation, a Gilson sample handler, and a flow-injection probe.
Two vendors currently offer complete flow-injection systems: Bruker
Instruments and Varian Instruments. In addition, the Nalorac Corporation
manufactures an LC probe that can also be used for flow-injection NMR
screening. A schematic of the Bruker Efficient Transport System (BEST)
manufactured by Bruker Instruments is shown in Figure 10. The Gilson 215
sample handler supplied by Bruker is equipped with two Rheodyne 819 valves.
The first valve is attached to a 5 mL syringe, the needle capillary in the sample
handler injection arm, the bridge capillary, the waste reservoir, and the second
valve. The second Rheodyne valve is attached to the input and output of the
probe, the source of nitrogen gas, the first valve, and the injection port. FEP
Teflon tubing is used in each of the connections with the exception of the gas
connection, which uses PEEK tubing.

A sample is injected into the Bruker probe by filling the needle capillary
and transferring the sample into the inlet tubing for the probe using the second
Rheodyne valve. In quick mode, the next sample is loaded into the tubing
during the spectral acquisition of the previous sample. When the spectral
acquisition has completed, the first sample exits the probe through the outlet
capillary. This action pulls the next sample into the probe through the inlet port
and spectral acquisition can immediately begin. Quick mode acquisition can
save approximately one minute per sample from the time it would take to load
each sample individually. However, sample recovery is not currently an option
with this method. In order to recover a sample, each sample is injected
individually using normal mode acquisition. The sample is recovered by
selecting either nitrogen gas or the syringe to pull the sample back from the
probe through the inlet tube. The sample can then be returned to the Gilson
liquid handler into its original well or into a new 96 well plate. A recovery unit
has recently been added to the BEST system to improve the efficiency of
recovery of the syringe by using the nitrogen gas to create a back pressure on the
sample.

Two useful accessories available for the BEST system are a Valvemate
solvent switcher and a heated transfer line. The solvent switcher was added to
the flow system for the combinatorial chemist who may want to analyze samples
in various organic solvents, but it can also be used for a library screen to vary
buffer conditions or to clean the probe out with an acid or a base. The heated
transfer line is used to equilibrate the sample temperature to the probe
temperature during sample transfer. Both the inlet and output capillary transfer
lines are threaded through the heated transfer line. This feature is desirable
when the spectral analysis time is short and a high throughput of samples is
required. In the ideal case, data acquisition using this accessory can begin
immediately after the sample enters the probe. Some samples may still require a
temperature equilibration period after entering the probe.

The setup of the Versatile Automated Sample Transport (VAST) system
produced by Varian is similar to the Bruker system. The VAST system consists
of a Gilson 215 liquid handler, a Varian NMR flow probe, an NMR console, and
a Sun workstation. The Gilson liquid handler supplied by Varian is equipped
with a single Rheodyne 819 valve and is connected to the NMR flow probe with
0.010 inch inside diameter PEEK tubing (P. A. Keifer et al., J. Comb. Chem., 2,
151 (2000)). In the Varian system design, the sample handler injects a specified
volume of sample into the probe, the data is acquired, and then the flow of liquid
through the tubing is reversed and the sample is returned to its original vial or
well. The return of the sample to the Gilson by the syringe pump is assisted by a
Valco valve and nitrogen gas which supply some backpressure on the outlet
portion of the Varian flow probe. With the VAST system setup, the probe is
rinsed just prior to sample injection and then is dried with nitrogen gas to
minimize dilution of the sample during injection. The Varian design gives
excellent sample recovery without dilution, but it is strongly recommended that
samples be filtered to prevent clogging of the capillary transfer lines (P. A.
Keifer et al., J. Comb. Chem., 2, 151 (2000)).

Flow NMR systems are ideally suited for use with the shielded magnets
manufactured by Bruker Instruments or Oxford Magnets. Actively shielding a
600 MHz magnet reduces the radial 5 gauss line from approximately 4 meters to
less than 2 meters, which allows the Gilson liquid handler to be placed
significantly closer to the magnet. This reduces the length of tubing needed
between the Rheodyne valve and the flow-injection probe and minimizes the
sample transfer time. The potential for clogging and sample dilution are
concomitantly reduced.

Bruker uses two software packages to run the BEST system: BEST
Administrator and ICONNMR (Bruker Instruments, AMIX, BEST and
ICONNMR software packages). The BEST Administrator is activated by typing
the command 'BESTADM' in XWINNMR. This portion of the software is used
during method generation and optimization. Samples are injected into the probe
one at a time and data is collected under XWINNMR. Early versions of the
BEST software utilized three separate programs: CFBEST, SUBEST, and
OTBEST. These functions were recently combined under the single software
package, BEST Administrator. In addition, the parameters available for
customization have been greatly expanded to include automated solvent
switching and method switching, which were not available in earlier versions of
the software. The software package ICONNMR is used after a flow method has
been optimized with the BEST Administrator. This package is set up for full
automation and is the same software used with automated NMR tube sample
changers. In a similar fashion, Varian software uses the command 'Gilson' to
generate a method before sample injection and data acquisition is initiated using
Enter/Autogo in VNMR (Varian NMR Systems, VNMR software package).



Flow Probe Calibration and System Optimization

In addition to the normal 90° pulse lengths and power levels which are
calibrated for any NMR probe, several additional calibrations are required for a
flow probe. The three additional volumes required to calibrate a Bruker flow
probe are shown schematically in Figure 11 (Bruker Instruments, AMIX, BEST
and ICONNMR software packages). The first volume calibrated is the total
probe volume. This can be accomplished by injecting a colored liquid into the
inlet of a dry probe with a syringe and watching for the liquid to appear in the
outlet port (approximately 700-800 µL). With the Varian system, the system
filling volume also includes the capillary tubing that connects the injector port to
the flow probe (P. A. Keifer et al., J. Comb. Chem., 2, 151 (2000)). This volume
is used to calculate the distance required to reposition a sample from the Gilson
sample handler to the center of the flow cell in the probe.

The second volume calibrated is the flow cell volume. This is the
volume of liquid required to fully fill the coil around the flow cell. The three
flow probe vendors (Bruker, Varian, and Nalorac) have probes available with
active volumes ranging from 30-250 µL. The stated volume of the flow cell in a
5 mm Bruker flow probe is 250 µL, but it was calibrated to be approximately
300 µL. This volume can be calibrated by making repeated injections of a
standard sample, starting with a volume less than the stated active volume of the
probe, and collecting a 1D ¹H NMR spectrum. The injection volume can then be
increased incrementally until no further improvement in signal-to-noise is
observed.
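
The incremental calibration just described amounts to finding where the signal-to-noise curve levels off. The sketch below shows one way to pick that point from a series of injections; the 2% gain cutoff, the function name, and the measurements are assumptions made for illustration.

```python
def calibrate_flow_cell_volume(volumes_ul, signal_to_noise, min_gain=0.02):
    """Return the smallest volume beyond which S/N stops improving.

    volumes_ul are injected volumes in ascending order and signal_to_noise
    holds the matching measurements. Improvement is judged by fractional
    gain between consecutive injections; min_gain is an assumed cutoff.
    Returns None if S/N is still rising at the largest volume tested.
    """
    for index in range(len(volumes_ul) - 1):
        current = signal_to_noise[index]
        gain = (signal_to_noise[index + 1] - current) / current
        if gain < min_gain:
            return volumes_ul[index]
    return None


# Hypothetical S/N readings for a standard sample in a 5 mm flow probe.
volumes = [200, 225, 250, 275, 300, 325]
readings = [60.0, 72.0, 85.0, 95.0, 100.0, 100.5]
calibrated_volume = calibrate_flow_cell_volume(volumes, readings)
```

With these made-up readings the plateau is reached at 300 µL, the same order as the calibrated value quoted above for a probe with a stated 250 µL cell.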

In addition to the two probe volume calibrations already discussed, 
Bruker software also includes a third volume for calibration. This volume, 
referred to as the positioning volume, is used to optimize the centering of a 
sample in the flow cell. Early versions of ICONNMR software (prior to 3.0.a.9) 
did not include the ability to set the positioning volume. Rather, Bruker 
literature suggested that the flow cell volume should be roughly doubled to 
ensure that the sample would completely fill the coil (Bruker Instruments, 
AMIX, BEST and ICONNMR software packages). Fortunately, this is no longer 
necessary. The positioning volume can now be used to optimize the sample 
position. This calibration reduced the sample size required for injection from 
450 µL in the first few protein screens to 300 µL for current screens using a 
Bruker 5 mm flow probe with an active volume of 250 µL. Optimization of this 
parameter minimized the sample volume required for each spectrum. 
Importantly, this significantly reduced the total amount of protein (or other 
target) at a given concentration needed to screen our small molecule library. The 
positioning volume can be optimized by collecting a series of spectra on a 
standard sample. From one spectrum to the next, the positioning volume can first be 
varied by large increments (50-100 µL) to get a rough estimate of the volume. 
An example of three such spectra is shown in Figure 12. The positioning 
volume can then be varied in smaller increments (10-25 µL) to identify the best 
volume for this parameter. The best signal-to-noise was obtained for our 5 mm 
Bruker flow probe on a DRX-600 when the positioning volume was set to +25 
µL, but this volume is probe specific and should be calibrated for each flow probe. 
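
The two-stage search described above (coarse steps, then fine steps around the best coarse value) can be sketched as follows. This is only an illustration: `measure_sn` is a hypothetical callback standing in for acquiring a spectrum at a given positioning volume and returning its signal-to-noise, and the scan range and step sizes are assumptions chosen to match the increments mentioned in the text.

```python
def scan(measure_sn, start, stop, step):
    """Measure S/N at each volume in [start, stop] and return the best volume."""
    volumes = range(start, stop + 1, step)
    return max(volumes, key=measure_sn)


def optimize_positioning_volume(measure_sn, lo=-100, hi=200,
                                coarse_step=50, fine_step=25):
    """Coarse scan over [lo, hi], then a fine scan around the coarse optimum."""
    coarse_best = scan(measure_sn, lo, hi, coarse_step)
    return scan(measure_sn,
                coarse_best - coarse_step,
                coarse_best + coarse_step,
                fine_step)


# Synthetic response peaking at +25 uL, used only to exercise the code.
def fake_sn(volume_uL):
    return 100.0 - abs(volume_uL - 25)


print(optimize_positioning_volume(fake_sn))
```

In practice each call to the callback costs one acquisition on the standard sample, so the coarse pass keeps the number of spectra small before the fine pass narrows in on the probe-specific optimum.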

The optimization of a flow-injection system for screening has three main 
objectives. The first objective is to transfer an aqueous sample to the center of 
the flow cell for analysis using the parameters determined during the flow probe 
calibration described above. The second objective is to reposition a sample from 
the Gilson liquid handler into the flow-injection probe without bubbles and with 
minimal sample dilution. This can be achieved by using nitrogen as a transfer 
gas (which keeps the system under pressure) and by using a series of leading and 
trailing solvents. In our experiments, we typically use 150 µL of ²H₂O as a 
leading solvent, 20 µL of nitrogen gas, 300 µL of sample, 20 µL of nitrogen gas, 
and 100 µL of ²H₂O as a trailing solvent. Alternatively, a larger volume of 
sample can be used in place of the push solvents. The third objective is to 
determine a cleaning procedure that reduces sample carry-over to less 
than 0.1%. Typically, this involves rinsing the probe with a predetermined 
volume of water. The rinse cycle can also be followed by a dry cycle, in which 
the capillary lines and flow probe are dried with nitrogen gas to further minimize 
sample dilution. In our experiments, we typically use a 1-mL wash volume 
followed by a 30 second drying time with nitrogen gas. 
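
The segmented injection described above can be written down as an ordered list of plugs, which makes the total transferred volume and the carry-over criterion easy to check. In this sketch the segment volumes follow the values given above, all taken in microliters; the data layout, the function names, and the example peak integrals are assumptions for illustration.

```python
# Ordered plugs pushed toward the flow probe: (description, volume in uL).
INJECTION_TRAIN = [
    ("leading solvent (2H2O)", 150),
    ("nitrogen gap", 20),
    ("sample", 300),
    ("nitrogen gap", 20),
    ("trailing solvent (2H2O)", 100),
]


def total_volume(train):
    """Sum of all liquid and gas plug volumes in the train, in uL."""
    return sum(volume for _, volume in train)


def carry_over_percent(residual_integral, sample_integral):
    """Carry-over as a percentage of the original sample signal."""
    return 100.0 * residual_integral / sample_integral


print(total_volume(INJECTION_TRAIN))            # 590
# Hypothetical integrals: residual peak in a blank run after the wash cycle.
print(carry_over_percent(0.8, 1000.0) < 0.1)    # True
```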

Design of Small Molecule Screening Libraries 
With the increasing prevalence of extremely high throughput screening 
equipment in the pharmaceutical industry, it may seem counterintuitive to 
suggest screening smaller collections of compounds in an NMR-based assay. 
However, a correlation between the quality of hits obtained and the number of 
compounds screened has not been well documented. In fact, compounds are 
typically added to screening collections not simply to increase their numbers, but 
to increase the diversity and quality of the compound collection. Thus, if one 
could find suitable hits from a smaller collection of well-chosen compounds, it 
may not be necessary to expend the time and chemical resources to screen the 
entire compound library against every single target. Hits so identified could then 
be used to focus further screening efforts or to direct combinatorial syntheses, 
thus saving both time and chemical resources, as shown schematically in Figure 
1. An NMR-based screen, like other binding assays, has the advantage that a 
high throughput functional assay does not need to be developed. This will become 
increasingly important as more and more targets of interest to pharmaceutical 
research are derived from genomics efforts and thus may not have a known 
function that can be assayed. 

Several types of libraries are possible: broad screening libraries 
applicable to many types of target proteins, directed libraries that are designed 
with the common features of an active site in mind that might be useful for 
screening a series of targets from the same protein class, such as protease 
enzymes, and "functional genomics" libraries composed of known substrates, 
cofactors and inhibitors for a diverse array of enzymes that might be useful for 
defining the function of genomics-identified targets. 

Ideally, the size and content of a broad screening library should be such 
that screening can be accomplished in a day or two with a favorable chance of 
identifying several hits for each of the target proteins to be screened. Rather 
than just randomly choosing a subset library, several rational approaches have 
been implemented. These include the SHAPES library developed by Fejzo and 
coworkers that is composed largely of molecules that represent frameworks 
commonly found in known drug molecules (J. Fejzo et al., Chem. Biol., 6, 755 
(1999)), drug-like or lead-like libraries, and diversity-based libraries. A number 
of studies have recently appeared that discuss the properties of known drugs and 
methods to distinguish between drug-like and non-drug-like compounds (G. W. 
Bemis et al., J. Med. Chem., 39, 2887 (1996); C. A. Lipinski et al., Adv. Drug 
Del. Rev., 23, 3 (1997); Ajay et al., J. Med. Chem., 41, 3314 (1998); J. Sadowski 
et al., J. Med. Chem., 41, 3325 (1998); A. K. Ghose et al., J. Comb. Chem., 1, 55 
(1999); J. Wang et al., J. Comb. Chem., 1, 524 (1999); and G. W. Bemis et al., J. 
Med. Chem., 42, 5095 (1999)). Superimposing drug-like (E. J. Martin et al., J. 
Comb. Chem., 1, 32 (1999)) or lead-like (S. J. Teague et al., Angew. Chem. Int. 
Ed., 38, 3743 (1999)) properties on a diversity-selected compound set may yield 
the best library of compounds. The distinction of lead-like is important since the 
NMR-based assay is designed to identify weak-affinity compounds that will 
most likely gain molecular weight and lipophilicity to become drug candidates or 
even lead chemical templates (S. J. Teague et al., Angew. Chem. Int. Ed., 38, 
3743 (1999)). 

Development and expansion of our lead-like NMR screening library to 
mimic the structural diversity of our larger compound collection has made use of 
the DiverseSolutions software for chemical diversity (R. S. Pearlman et al., 
Persp. Drug Disc. Des., 9/10/11, 339 (1998)). In this approach, each compound 
is described by a set of descriptors, which are metrics of chemistry space. Six 
orthogonal descriptors, related to substructures as opposed to the entire 
molecule, are often used. While the descriptors to use can be automatically 
chosen to maximize diversity, typically there are two each corresponding to 
charge, polarizability and hydrogen-bonding. A cell-based diversity algorithm is 
employed to divide the descriptor axes into bins and thus into a lattice of 
multidimensional hypercubes. As an example of how this can be used to 
construct or expand a small screening library, consider the selection of 1,000 
compounds from a compound library of 250,000 compounds. First, the cell-based 
algorithm is used to partition the 250,000 compounds into approximately 
1,000 cells. The number of compounds per cell will vary and some will be 
empty. Maximum structural diversity will be obtained by taking one compound 
from each occupied cell (and as close to the center as possible). The actual 
compounds chosen are based on desirable lead-like properties such as low 
molecular weight and hydrophilicity as well as availability and chemical 
non-reactivity as explained below. Diversity voids, as exemplified by empty cells, 
can be filled from external sources or by chemical syntheses if desired. 
Identifying and filling diversity voids is important since larger compound 
collections are often heavily weighted in certain classes of compounds stemming 
from earlier research projects. 
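
The cell-based selection described above can be illustrated with a short sketch. It is a generic binning scheme written for illustration and is not the DiverseSolutions implementation; the function names, the equal-width bins, and the toy descriptor values are assumptions. It also uses only distance to the cell center as the selection criterion, whereas the text notes that lead-like properties, availability, and reactivity also enter the final choice.

```python
from collections import defaultdict


def assign_cell(descriptors, lows, highs, bins_per_axis):
    """Map a descriptor vector to the index tuple of its hypercube cell."""
    cell = []
    for value, low, high in zip(descriptors, lows, highs):
        width = (high - low) / bins_per_axis
        index = int((value - low) / width)
        cell.append(min(max(index, 0), bins_per_axis - 1))  # clamp edges
    return tuple(cell)


def cell_center(cell, lows, highs, bins_per_axis):
    """Coordinates of the center of a cell."""
    return [low + (i + 0.5) * (high - low) / bins_per_axis
            for i, low, high in zip(cell, lows, highs)]


def select_diverse_subset(compounds, lows, highs, bins_per_axis):
    """Pick, from each occupied cell, the compound nearest the cell center.

    compounds: dict mapping compound id -> descriptor vector.
    """
    cells = defaultdict(list)
    for cid, desc in compounds.items():
        cells[assign_cell(desc, lows, highs, bins_per_axis)].append(cid)

    chosen = {}
    for cell, members in cells.items():
        center = cell_center(cell, lows, highs, bins_per_axis)

        def sq_dist(cid):
            return sum((x - c) ** 2 for x, c in zip(compounds[cid], center))

        chosen[cell] = min(members, key=sq_dist)
    return chosen


# Toy two-descriptor example (hypothetical values), 2 bins per axis.
library = {
    "A": [0.10, 0.10],
    "B": [0.24, 0.26],   # same cell as A, closer to its center (0.25, 0.25)
    "C": [0.80, 0.70],
}
picked = select_diverse_subset(library, lows=[0, 0], highs=[1, 1], bins_per_axis=2)
print(sorted(picked.values()))
```

Cells that never appear as keys in the result are the empty cells, that is, the diversity voids that could be filled from external sources or by synthesis.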

An example of diversity-based subset selection using these methods is 
shown in Figure 13. Here, the 6,436 compounds from the Comprehensive 
Medicinal Chemistry index have been divided into 2,012 cells to maximize 
diversity using five chemistry-space descriptors. The two-dimensional 
representation projected onto the hydrogen bond acceptor and charge BCUT 
axes is shown in gray. The black squares correspond to the 1,474 lead-like 
compounds (molecular weight less than 350 and 1 < cLogP < 3) contained in the 
CMC index. A total of 806 of the 2,012 cells were occupied by lead-like 
compounds. A similar approach could be used to select diverse, lead-like 
compounds from a large corporate compound collection. 
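
The lead-like criteria and the cell occupancy quoted above reduce to a simple filter and a ratio. The sketch below assumes molecular weight and cLogP have already been computed by some other tool; the function name and the two example compounds are hypothetical.

```python
def is_lead_like(mol_weight, clogp):
    """Lead-like criteria quoted in the text: MW < 350 and 1 < cLogP < 3."""
    return mol_weight < 350 and 1 < clogp < 3


# Fraction of cells occupied by lead-like compounds in the Figure 13 example.
occupied_cells, total_cells = 806, 2012
print(f"{100 * occupied_cells / total_cells:.1f}% of cells occupied")

# Hypothetical property values, for illustration only.
print(is_lead_like(mol_weight=280.3, clogp=2.1))   # True
print(is_lead_like(mol_weight=412.5, clogp=2.1))   # False
```

Roughly 40% of the cells contain at least one lead-like compound in this example, so the remaining cells would have to be covered by compounds outside the lead-like window or left as voids.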

The cell concept of structural space is quite useful after the screening is 
complete. When a hit is identified, other compounds from the same or nearby 
cells are obvious candidates for secondary assays. One can think of this as the 
gold mine analogy: when gold is struck, the search is best continued in close 
proximity. 

In addition to structural diversity, there are other characteristics that can 
be considered when selecting the subset molecules. These include purity, 
identity, reactivity, toxicological properties, molecular weight, water solubility, 
and suitability for chemical elaboration by traditional or combinatorial methods. 
It makes sense to populate the screening library with compounds of high 
integrity that are not destined for failure down the road. Time spent upfront to 
ensure purity and identity with LC-MS or LC-NMR analyses will save resources 
downstream. Filtering tools can be used to avoid compounds that are known to 
be highly reactive, toxic, or to have poor metabolic properties. Lack of 
reactivity is important since compounds can be screened more efficiently as 
mixtures. Like other labs (S. B. Shuker et al., Science, 274, 1531 (1996); B. 
Meyer et al., Eur. J. Biochem., 246, 705 (1997); J. Fejzo et al., Chem. Biol., 6, 
755 (1999); and M. Lin et al., J. Org. Chem., 62, 8930 (1997)) we typically pool 
our selected small molecules into mixtures of 6-10 compounds for screening (K. 
A. Farley et al., SMASH'99, Argonne, IL, 15-18 August 1999). 

Compounds chosen for our diversity library are lead-like as opposed to 
drug-like. It is often the case that chemical elaborations to improve affinity also 
increase molecular weight and decrease solubility (S. J. Teague et al., Angew. 
Chem. Int. Ed., 38, 3743 (1999)). The molecular weight of the compounds 
therefore should preferably not exceed about 350. Since most hits obtained will 
have affinities for their target in the approximately 100 µM range, low molecular 
weight will leave room for chemical elaboration to build in more affinity and 
selectivity. Using larger molecular weight drug-like compounds would not 
substantially improve affinity of the hits and could easily preclude obtaining lead 
chemical templates of reasonable size. Lead-like hits that are reasonably water 
soluble allow for chemical elaboration that results in modestly increased 
lipophilicity of the final therapeutic entity (S. J. Teague et al., Angew. Chem. Int. 
Ed., 38, 3743 (1999)). Water solubility is also important since it enhances the 
potential success of downstream studies such as calorimetry, enzymology, 
co-crystallization and NMR structural studies. Compound solubility is especially 
important for flow-injection NMR methods in order to prevent clogging of the 
capillary lines. 

Compounds should also be chosen with their suitability for chemical 
elaboration by traditional or combinatorial chemistry methods in mind. Hits 
with facile handles for synthetic chemistry will be of more interest and will 
allow more efficient use of often limited medicinal chemistry resources. 

Relaxation-Edited or WaterLOGSY-Based Flow-Injection NMR Screening 
Methods 

Calibration and validation of the flow system and creation of a small-molecule 
screening library yield an automated system that is ready to screen 
new targets. A protein target can be analyzed for protein-ligand interactions 
using relaxation-editing methods by adding sufficient protein to each well of the 
96-well library plate to give a 1:1 (protein:ligand) ratio at a concentration of 
approximately 50 µM. Homogeneous sample dispersion throughout the well can 
be facilitated by agitating the plate on a flat bed shaker. Screening at this 
concentration allows a decent 1D ¹H NMR spectrum to be acquired in about 10 
minutes. In our experience, this concentration of target and small molecule 
requires identified ligands to have affinities on the order of approximately 200 
µM or tighter. 
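
To see why affinities much weaker than this become hard to detect at these concentrations, the fraction of ligand bound can be estimated from the standard 1:1 binding equilibrium. This is a textbook relationship, not a calculation taken from this work. The sketch assumes the affinity limit quoted above is in micromolar, ignores competition among ligands in a mixture, and uses an illustrative function name.

```python
import math


def fraction_ligand_bound(protein_uM, ligand_uM, kd_uM):
    """Fraction of ligand bound for a 1:1 complex, all concentrations in uM.

    Solves [PL]^2 - (P + L + Kd)[PL] + P*L = 0 for the physical root.
    """
    s = protein_uM + ligand_uM + kd_uM
    complex_uM = (s - math.sqrt(s * s - 4.0 * protein_uM * ligand_uM)) / 2.0
    return complex_uM / ligand_uM


for kd in (20.0, 200.0, 2000.0):
    print(kd, round(fraction_ligand_bound(50.0, 50.0, kd), 3))
```

With 50 µM protein and 50 µM ligand, a 200 µM binder is about 17% bound, while a 2 mM binder is only about 2% bound, so its resonances would change very little on adding protein.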

Once the screening plate has been prepared, the Gilson liquid sample 
handler transfers samples from 96-well plates into the flow-injection probe and, if 
desired, returns the samples back into either the original 96-well plate or a new 
plate. Once the sample is in the magnet, spectra that can detect changes in 
chemical shifts, relaxation properties, or diffusion properties can be collected. In 
our relaxation-edited NMR screening assay, two 1D relaxation-edited ¹H NMR 
spectra are collected: one spectrum is collected on the ligand mixture in the 
presence of protein and the second, control spectrum is collected on the ligand 
mixture in the absence of protein. Ligands are identified as binding to a target 
when their resonances are greatly reduced when compared to a relaxation-edited 
spectrum collected in the absence of protein as illustrated in Figure 14. In this 
example, the target protein was a genomics-derived protein of unknown 
function. 
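
The comparison of the two relaxation-edited spectra can be expressed as an intensity ratio per assigned ligand resonance. The following sketch assumes peak intensities have already been extracted and assigned to ligands; the 0.5 cutoff standing in for "greatly reduced", the function name, and the example intensities are all hypothetical.

```python
def flag_binders(control, with_protein, max_ratio=0.5):
    """Return ligands whose peak intensity drops sharply when protein is added.

    control, with_protein: dicts mapping ligand name -> peak intensity in the
    relaxation-edited spectrum without and with protein.
    max_ratio: hypothetical cutoff for "greatly reduced".
    """
    hits = []
    for ligand, reference in control.items():
        if reference <= 0:
            continue  # no usable reference signal
        ratio = with_protein.get(ligand, 0.0) / reference
        if ratio <= max_ratio:
            hits.append(ligand)
    return hits


control = {"cmpd_1": 100.0, "cmpd_2": 95.0, "cmpd_3": 110.0}
with_protein = {"cmpd_1": 97.0, "cmpd_2": 20.0, "cmpd_3": 104.0}
print(flag_binders(control, with_protein))   # ['cmpd_2']
```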

When binding is detected using the WaterLOGSY technique, sample 
preparation and use of the flow-injection apparatus are identical, except that 
extremely low levels of target are used (1-10 µM) with ratios of ligand to target 
of 100:1 to 10:1. For data analysis, binding compounds are distinguished from 
nonbinders by the opposite sign of their water-ligand NOEs. In contrast to the 
relaxation-edited technique, only a single WaterLOGSY spectrum is used for 
each ligand mixture. There is no need to collect a reference spectrum in the 
absence of target protein. An example is illustrated in Figure 15 for a mixture of 
compounds and a 37 kDa protein. In the WaterLOGSY spectrum shown in 
Figure 15, binding compounds have resonances opposite in sign (sharp 
positive peaks) to those of nonbinders (near zero intensity or sharp negative peaks). 
Residual protein resonances are also of positive intensity. 
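
Because the WaterLOGSY readout is a sign, classification of assigned ligand peaks is straightforward. The sketch below assumes the spectrum is phased so that binders are positive, as described above, and that each peak has already been assigned to a ligand (residual protein signals, which are also positive, must be excluded beforehand). The noise threshold and names are hypothetical.

```python
def classify_waterlogsy(peaks, noise_level):
    """Label each ligand from the sign of its WaterLOGSY peak.

    peaks: dict mapping ligand name -> signed peak intensity, phased so that
    binders are positive (as in the spectrum described in the text).
    noise_level: hypothetical threshold below which a peak is treated as zero.
    """
    labels = {}
    for ligand, intensity in peaks.items():
        if intensity > noise_level:
            labels[ligand] = "binder"
        else:
            labels[ligand] = "nonbinder"  # near zero or negative
    return labels


peaks = {"cmpd_1": 42.0, "cmpd_2": -35.0, "cmpd_3": 1.5}
print(classify_waterlogsy(peaks, noise_level=5.0))
```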

Ligand binding can be confirmed by collecting a 1D relaxation-edited ¹H 
NMR spectrum of each individual ligand that was identified as binding to the 
protein in a given mixture as shown in Figure 16. In addition, the binding 
constant of the protein/ligand interaction can be estimated using 1D 
diffusion-edited spectra of the ligand in the presence and absence of protein (A. J. Lennon 
et al., Biophys. J., 67, 2096 (1994)). If labeled protein is available, a 2D ¹H-¹⁵N 
HSQC spectrum can also be obtained to locate the ligand binding site on the 
protein (J. Wang et al., Biochemistry, 31, 921 (1992); and S. B. Shuker et al., 
Science, 274, 1531 (1996)). In cases where the protein is small enough and 
structural characterization of the binding interaction is desired, further 
experiments can be carried out using ¹⁵N and/or ¹³C/¹⁵N protein/ligand 
complexes. 

Data Analysis 

The development of flow probes has facilitated the transition to 
high-throughput NMR and has made possible the routine collection of tremendous 
volumes of data. Recent software developments have advanced the automated 
handling of large data sets collected on combinatorial chemistry libraries (P. A. 
Keifer et al., J. Comb. Chem., 2, 151 (2000); Bruker Instruments, AMIX, BEST 
and ICONNMR software packages; Varian NMR Systems, VNMR software 
package; and A. Williams, Book of Abstracts, 218th ACS National Meeting 
(1999)). Visualization of results in a 96-well format allows rapid evaluation of 
the data sets. The integration of features such as this into a software package 
tailored more for data reduction and evaluation of library screening data sets 
parallels the combinatorial chemistry software development but remains slightly 
behind. However, recent advancements that have been made for combinatorial 
chemistry data analyses portend similar developments for the automation of 
protein binding screening data. 

In our 1D relaxation-edited ¹H NMR data sets, one can simply identify 
the ligand resonances by inspection since their intensity is reduced in the 
presence of protein as shown in Figure 14. In our WaterLOGSY data sets, 
binding compounds are distinguished from nonbinders by the opposite sign of 
their water-ligand NOEs as observed in Figure 15. In either case, comparison to 
an assigned small molecule control spectrum is made to identify the compound 
associated with the indicated resonances. 

Other labs have relied on difference spectra to analyze relaxation- or 
diffusion-edited 1D ¹H NMR data sets (P. J. Hajduk et al., J. Am. Chem. Soc., 
119, 12257 (1997); N. Gonnella et al., J. Magn. Reson., 131, 336 (1998); and A. 
Chen et al., J. Am. Chem. Soc., 122, 414 (2000)). After a series of spectral 
subtractions, the resulting spectrum represents the resonances of the compounds 
that bind to the protein. Two factors that pose problems are line broadening and 
shifting resonances, both of which can lead to subtraction artifacts. Changes in 
intensity can also add the need for a scaling factor in the data analysis step. 
These additional steps, which can vary from one spectrum to the next, make 
strategies for automated data analysis complex. 

Data analysis for 2D screening methods typically involves either the 
analysis of protein chemical shift perturbations indicative of ligand binding (A. 
Ross et al., J. Biomol. NMR, 16, 139 (2000); and S. B. Shuker et al., Science, 
274, 1531 (1996)), or the analysis of changes in signals from the small 
molecules in NOE or DECODES spectra indicative of binding (B. Meyer et al., 
Eur. J. Biochem., 246, 705 (1997); J. Fejzo et al., Chem. Biol., 6, 755 (1999); 
and M. Lin et al., J. Am. Chem. Soc., 119, 5249 (1997)). While a series of 2D 
¹H-¹⁵N HSQC spectra can be compared manually, automated analysis using both 
non-statistical and statistical approaches of a series of ¹H-¹⁵N HSQC spectra 
acquired with flow-injection NMR methods was recently demonstrated (A. Ross 
et al., J. Biomol. NMR, 16, 139 (2000)). AMIX was used for the non-statistical 
analysis by comparing spectra collected in the presence of single compounds to 
the reference spectrum of the protein alone. Then, using bucketing calculations 
for data reduction, a table ranked by the correlation coefficient was generated. 
No correlations were observed using the bucketing calculations alone. 
Subsequently, integration patterns for all 300 small molecule spectra were 
analyzed by AMIX to generate a data matrix of N integration regions times 300. 
A statistical software package, UNSCRAMBLER 6.0, was then used to analyze 
this data matrix using principal components analysis. Two classes of spectral 
changes were observed. Ultimately, one class was found to correspond to pH 
changes caused by certain small molecules while the other class corresponded to 
small molecules binding to the target protein (A. Ross et al., J. Biomol. NMR, 
16, 139 (2000)). 
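
The bucketing and correlation-ranking step mentioned above can be illustrated generically. The sketch below is not the AMIX implementation and does not reproduce its interface; it simply sums intensities over fixed-width buckets of a 1D intensity list and ranks spectra by their Pearson correlation with a reference. The function names and the tiny synthetic spectra are assumptions.

```python
import math


def bucket(spectrum, bucket_width):
    """Sum intensities over consecutive blocks of `bucket_width` points."""
    return [sum(spectrum[i:i + bucket_width])
            for i in range(0, len(spectrum), bucket_width)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rank_by_similarity(reference, spectra, bucket_width):
    """Rank spectra from least to most similar to the reference."""
    ref_buckets = bucket(reference, bucket_width)
    scores = {name: pearson(ref_buckets, bucket(s, bucket_width))
              for name, s in spectra.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Tiny synthetic "spectra" (hypothetical numbers).
reference = [0, 1, 8, 1, 0, 0, 5, 1]
spectra = {
    "unchanged": [0, 1, 8, 1, 0, 0, 5, 1],
    "perturbed": [0, 1, 3, 1, 0, 4, 5, 1],
}
for name, score in rank_by_similarity(reference, spectra, bucket_width=2):
    print(name, round(score, 3))
```

Spectra at the top of such a ranking differ most from the reference and would be the first candidates for inspection; a multivariate method such as principal components analysis would then operate on the matrix of bucket values rather than on a single correlation score.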

Data reduction is an important aspect for handling the amounts of data 
generated if high-throughput screening by NMR is to be successful. 
Non-statistical methods such as the bucketing calculations of AMIX (Bruker 
Instruments, AMIX, BEST and ICONNMR software packages) or the database 
comparisons of ACD (A. Williams, Book of Abstracts, 218th ACS National 
Meeting (1999)) compare chemical shift, multiplicity, integration regions and 
patterns to give correlation factors between spectra. These software packages can 
be used for data reduction of both one- and two-dimensional data. Prediction 
software is also available to aid in interpretation of data sets. Statistical 
methods such as principal components analysis can be used to analyze data for 
other correlations that are not apparent using non-statistical methods alone. In 
the case of 2D ¹H-¹⁵N HSQC data, an adaptive, multivariate method that 
incorporates a weighted mapping of perturbations to correlate information within 
a spectrum or across many spectra has also been described (F. Delaglio, CHI 
Conference on NMR Technologies: Development and Applications for Drug 
Discovery, Baltimore, MD, 4-5 November 1999). 

Comparison of Flow vs. Traditional Methods 

The advantage of working with samples in the flow NMR screening 
environment is that each set of spectra is collected on samples that are at the 
same concentration. This accelerates spectral acquisition considerably. Since 
the samples are fairly homogeneous, many of the routine tasks need to be 
completed on only the first sample: probe tuning, ¹H 90° pulse calibration, 
receiver gain, number of transients, locking, and gradient shimming. On 
subsequent samples, these steps can be omitted, although simplex shimming of 
Z1 and Z2 can still be used with multi-day acquisitions. 

Prerequisites for a high-throughput assay include rapid data collection, 
sample-to-sample integrity and minimal costs. Flow NMR techniques have been 
developed with each in mind. For 1D ¹H NMR screening experiments, the 
process of removing the previous sample from the flow cell, rinsing the flow 
cell, injecting the next sample, allowing for thermal equilibration, automating 
solvent suppression and acquiring the data can take less than 10 minutes. In 
practice, the use of this procedure is two to three times faster than a sample 
changer with conventional NMR tubes. If compounds were screened in mixtures 
of 10, this results in a throughput of about 1,500 compounds per day. Use of a 
liquid handler, such as the Gilson 215 typically employed by Bruker and Varian 
flow NMR systems, can simplify the preparation of samples as well. Ross and 
coworkers have demonstrated on-the-fly sample preparation by using the liquid 
handler to mix the protein to be screened with the small molecule immediately 
prior to injection (A. Ross et al., J. Biomol. NMR, 16, 139 (2000)). Sample 
conditions can thus be highly standardized with the resulting spectra very 
consistent and reproducible. Even if target protein is added manually to 
pre-plated screening libraries, the amount of pipetting is still less than if using NMR 
tubes. Recurring expenses associated with purchasing and/or cleaning NMR 
tubes are eliminated with flow-injection NMR methods. The cost of the 96-well 
microtitre plates is insignificant compared to NMR tubes. 
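
The throughput figure quoted above follows from simple arithmetic on the cycle time and the mixture size. A minimal sketch, assuming continuous 24-hour operation and a fixed cycle time per sample; the function name is illustrative.

```python
def compounds_per_day(minutes_per_sample, compounds_per_mixture):
    """Compounds screened in 24 h of continuous acquisition."""
    samples_per_day = (24 * 60) // minutes_per_sample
    return samples_per_day * compounds_per_mixture


print(compounds_per_day(10, 10))   # 1440
print(compounds_per_day(10, 6))    # 864
```

At exactly 10 minutes per sample and 10 compounds per mixture this gives 1,440 compounds per day, and since the cycle can take less than 10 minutes, the figure of about 1,500 per day is consistent. With mixtures of 6, the same cycle time would give 864 per day.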

The complete disclosures of the patents, patent documents, and 
publications cited herein are incorporated by reference in their entirety as if each 
were individually incorporated. Various modifications and alterations to this 
invention will become apparent to those skilled in the art without departing from 
the scope and spirit of this invention. It should be understood that this invention 
is not intended to be unduly limited by the illustrative embodiments and 
examples set forth herein. Such examples and embodiments are presented by 
way of example only with the scope of the invention intended to be limited only 
by the claims set forth herein as follows. 